A Bioinformatics Resource for Crop Functional Genomics: GFSelector Module in Automated Annotation System, RiceGAAS

Authors

  • Katsumi SAKATA
  • Hiroshi IKAWA
  • Hiroyuki WATANABE
  • Ikuo ASHIKAWA
  • Yuji SHIMIZU
  • Ikuo HORIUCHI
  • Baltazar A. ANTONIO
  • Hisataka NUMA
  • Yoshiaki NAGAMURA
  • Takashi MATSUMOTO
Abstract

GFSelector (Gene Function Selector, http://alnilam.mi.mss.co.jp/rgadb/) has been developed to perform computational classification of gene models and assignment of a unique biological function. It has been incorporated into RiceGAAS (http://ricegaas.dna.affrc.go.jp/usr/), which was designed to provide an analysis pipeline for user-submitted genome sequences and a comprehensive database for all rice gene models. The combined system facilitates accurate modelling of predicted rice genes, classification of gene structure, and assignment of function and GO (Gene Ontology) terms to the gene models. Reliability and accuracy are enhanced by integrating several reference databases into the system and generating multiple candidates for determining the function of the gene models. The pipeline is also fully automated, thereby facilitating regular updates of the rice gene models using the latest reference databases. Soybean, wheat and banana BAC (bacterial artificial chromosome) sequences were annotated to test the applicability of the pipeline to other crops. Compared with the GenBank CDS (coding sequence) features, the pipeline achieved a nucleotide-level sensitivity of more than 83% for gene modelling. It was also confirmed that 95% of the functional annotations produced by the pipeline were nearly equal to or better than the corresponding GenBank CDS features.

Discipline: Biotechnology

Additional key words: database, gene ontology, web-based system

†These authors contributed equally to this work.
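The nucleotide-level sensitivity quoted in the abstract is commonly defined as the fraction of reference coding nucleotides that the prediction also marks as coding. The abstract does not describe how the authors computed it, so the following is only a minimal sketch of that common definition. The function names and the coordinates are hypothetical and chosen for illustration.

```python
def covered_positions(intervals):
    """Expand 1-based inclusive (start, end) intervals into a set of positions."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_sensitivity(reference_cds, predicted_cds):
    """Fraction of reference coding nucleotides that are also predicted as coding."""
    ref = covered_positions(reference_cds)
    pred = covered_positions(predicted_cds)
    if not ref:
        raise ValueError("reference annotation has no coding nucleotides")
    return len(ref & pred) / len(ref)


# Hypothetical exon coordinates, not taken from the paper.
reference = [(101, 200), (301, 400)]  # 200 reference coding nucleotides
predicted = [(121, 200), (301, 390)]  # shares 80 + 90 = 170 of them
sensitivity = nucleotide_sensitivity(reference, predicted)
```

Expanding intervals into position sets keeps the sketch short; for chromosome-scale annotations an interval-overlap computation would be more memory-efficient.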

Related articles

RiceGAAS: an automated annotation system and database for rice genome sequence

An extensive effort of the International Rice Genome Sequencing Project (IRGSP) has resulted in rapid accumulation of genome sequence, and >137 Mb has already been made available to the public domain as of August 2001. This requires a high-throughput annotation scheme to extract biologically useful and timely information from the sequence data on a regular basis. A new automated annotation syst...

B2G-FAR, a species-centered GO annotation repository

MOTIVATION Functional genomics research has expanded enormously in the last decade thanks to the cost reduction in high-throughput technologies and the development of computational tools that generate, standardize and share information on gene and protein function such as the Gene Ontology (GO). Nevertheless, many biologists, especially working with non-model organisms, still suffer from non-ex...

Functional Annotation of Two Hypothetical Proteins Reveals Valuable Proteins Involved in Response to Salinity: An in silico Approach

With the exponential growth in the determination of protein sequences and structures through genome sequencing and structural genomics approaches, there is a growing demand for valid bioinformatics methods to define the function of these proteins. In this study, our objective is to identify the function of unknown proteins from UCB-1 pistachio rootstock and specify their class...

FungiDB: an integrated functional genomics database for fungi

FungiDB (http://FungiDB.org) is a functional genomic resource for pan-fungal genomes that was developed in partnership with the Eukaryotic Pathogen Bioinformatic resource center (http://EuPathDB.org). FungiDB uses the same infrastructure and user interface as EuPathDB, which allows for sophisticated and integrated searches to be performed using an intuitive graphical system. The current release...

CoGenT++: an extensive and extensible data environment for computational genomics

MOTIVATION CoGenT++ is a data environment for computational research in comparative and functional genomics, designed to address issues of consistency, reproducibility, scalability and accessibility. DESCRIPTION CoGenT++ facilitates the re-distribution of all fully sequenced and published genomes, storing information about species, gene names and protein sequences. We describe our scalable im...

Publication date: 2009